Abstract 

Hematopoietic stem cell (HSC) gene therapy using replication-incompetent retroviral vectors is a 
promising approach to provide life-long correction for genetic defects. HSC gene therapy clinical 
studies have resulted in functional cures for several diseases, but in some studies clonal expansion 
or leukemia has occurred. This is due to the dysregulation of endogenous host gene expression 
from vector provirus insertional mutagenesis. Insertional mutagenesis screens using replicating 
retroviruses have been used extensively to identify genes that influence oncogenesis. However, 
retroviral mutagenesis screens can also be used to determine the role of genes in biological 
processes such as stem cell engraftment. The aim of this review is to describe the potential for 
vector insertion site data from gene therapy studies to provide novel insights into mechanisms of 
HSC engraftment. In HSC gene therapy studies dysregulation of host genes by replication- 
incompetent vector proviruses may lead to enrichment of repopulating clones with vector 
integrants near genes that influence engraftment. Thus, data from HSC gene therapy studies can be 
used to identify novel candidate engraftment genes. As HSC gene therapy use continues to 
expand, the vector insertion site data collected will be of great interest to help identify novel 
engraftment genes and may ultimately lead to new therapies to improve engraftment. 

Introduction 

Gene therapy using hematopoietic stem cells (HSC) has enormous potential to treat diseases 
of the hematopoietic system including immune diseases. In this approach, HSCs are 
collected from a patient, gene-modified ex vivo using integrating retroviral vectors, and then 
infused into a patient. To date retroviral vectors have been the only effective gene delivery 
system for HSC gene therapy. This is primarily due to the ability of retroviral vectors to 
efficiently integrate into the genome, thereby allowing efficient transmission of therapeutic 
transgenes to all HSC-derived cells via mitosis. Gene delivery to HSCs using integrating 
vectors thus allows for efficient delivery to HSC-derived mature hematopoietic cells. 

Retroviral vectors have been used successfully in HSC gene therapy clinical trials for 
several genetic diseases including X-linked severe combined immunodeficiency (SCID-X1) 
[1,2], adenosine deaminase deficiency (SCID-ADA) [3,4], chronic granulomatous disease 
(CGD) [5], and adrenoleukodystrophy (ALD) [6]. HSC gene therapy also has the potential 
to treat acquired diseases of the hematopoietic system such as human immunodeficiency 
virus infection and acquired immunodeficiency syndrome (HIV/AIDS) [7]. While recent 
clinical studies have shown promise, the use of retroviral vectors for gene therapy has 
drawbacks. Gene therapy using HSCs with integrating retroviral vectors can dysregulate 
cellular genes near the provirus integration site leading to adverse side effects including 
leukemia [8-10]. 

Previous human clinical studies have documented the impact of vector-mediated 
dysregulation of host genes. In both the French and United Kingdom SCID-X1 studies 
vector-mediated gene dysregulation resulted in the development of leukemia [8-10]. In a 
CGD study conducted by Ott and colleagues, proviral insertion sites led to the clonal 
expansion of gene-modified cells over time [5,11]. In this CGD study the vector provirus 
provided the gene-modified HSCs with a survival advantage, leading to the clonal 
dominance of a small subset of gene-modified cells in the patient. In the above SCID-X1 
and CGD studies, the ability to determine where the provirus had inserted into the genome 
allowed for the identification of nearby genes that were dysregulated, leading to clonal 
expansion. The integrated provirus can thus be used as a molecular tag to identify 
dysregulated genes in gene therapy studies. 

Gene-modified HSCs that are infused into patients undergo various selective pressures 
during the process of stem cell engraftment. First, the cells must home to the stem cell niche 
and resist apoptosis during this process. Once in the bone marrow, HSCs begin the 
production of all hematopoietic cell lineages which involves survival, stem cell self-renewal, 
proliferation and differentiation. Together, these processes are referred to as engraftment 
[12], and many genes could potentially provide a selective advantage to repopulating cells if 
dysregulated. The gene-modified cells that are infused into a patient are a polyclonal 
population, where different cells have vector proviruses integrated at different chromosomal 
locations. There may be millions of clones that are infused into a patient and this polyclonal 
population of cells is, in essence, a library of clones with many different unique integration 
sites. If a clone has a vector integrant near a gene that may influence the efficiency of 
engraftment, that clone has a selective advantage and may be over-represented when 
engrafted cells are analyzed (Figure 1). Thus, pre-clinical and clinical HSC gene therapy 
studies provide an opportunity to identify genes near vector proviruses in over-represented 
clones. These genes may have conferred an increased survival and proliferation advantage to 
the infused cells due to dysregulation mediated by the integrated provirus. 

This review covers the potential of HSC gene therapy studies to identify genes that play a 
role in engraftment. The use of retroviral mutagenesis screens to identify dysregulated genes 
involved in cancer has provided an enormous wealth of data [13]. These screens have been 
used to identify genes that have an effect on the development and progression of leukemia 
by analyzing replicating virus insertion sites to identify nearby genes that contributed to 
tumorigenesis and leukemic development [14]. However, it is clear that non-replicating 
viruses can also perturb nearby genes causing genotoxicity. Thus, HSC gene therapy studies 
are de facto mutagenesis screens where a library of vector-mutagenized cells is infused into 
patients and clones with a selective advantage to engraft can become over-represented. 
Although the goal of clinical gene therapy is to develop cures for life-threatening diseases, 
the data obtained from patient samples can also provide insight into the role of genes in 
hematopoietic processes. Analysis of retroviral integration sites in preclinical and clinical 
HSC gene therapy studies has the potential to identify novel genes involved in engraftment, 
and also other hematopoietic processes. Identifying novel engraftment genes can improve 
our understanding of this complex process, and also identify new therapeutic targets to 
enhance engraftment. 

Selective Pressure During Transplantation in Gene Therapy Studies 

HSCs are commonly harvested from the peripheral blood after mobilization. In order to 
mobilize HSCs from the bone marrow into the peripheral blood, patients receive 
recombinant human granulocyte-colony stimulating factor (G-CSF). The patient's peripheral 
blood is collected and enriched for HSCs using the CD34+ marker. HSCs are then cultured 
ex vivo and exposed to viral vectors. The ex vivo culture period varies between studies, but 
typically lasts approximately 1-4 days. During this time, vector proviruses integrate into the 
host genome, leading to a polyclonal population of HSCs that possess numerous proviral 
insertion sites. This highly polyclonal population of repopulating cells with vector 
proviruses at many integration sites is in essence a library where there is the potential to 
dysregulate a wide variety of genes. Some proviral integration sites may become over- 
represented during ex vivo culture due to a proliferative/survival advantage of clone(s) with 
this provirus. 
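
As an illustration only, over-representation can be quantified by comparing each clone's share of recovered integration-site reads before and after a selective step such as ex vivo culture or engraftment. The following Python sketch assumes that read counts approximate clone abundance, that a two-fold increase marks over-representation, and that a small pseudo-frequency is acceptable for clones absent from one sample. The site labels and counts are hypothetical, and none of these choices are taken from the studies reviewed here.

```python
from collections import Counter


def clone_frequencies(site_reads):
    """Convert per-read integration-site labels into relative clone frequencies."""
    counts = Counter(site_reads)
    total = sum(counts.values())
    return {site: n / total for site, n in counts.items()}


def fold_enrichment(before, after, pseudo=1e-6):
    """Fold change in clone frequency between two samples.

    Sites missing from a sample receive a small pseudo-frequency
    (an assumption) so that the ratio stays finite.
    """
    sites = set(before) | set(after)
    return {
        s: (after.get(s, 0.0) + pseudo) / (before.get(s, 0.0) + pseudo)
        for s in sites
    }


# Hypothetical data: each entry is the integration site recovered from one read.
pre = ["chr1:1000"] * 25 + ["chr2:5000"] * 25 + ["chr3:7000"] * 25 + ["chr11:900"] * 25
post = ["chr1:1000"] * 10 + ["chr2:5000"] * 10 + ["chr11:900"] * 80

enrichment = fold_enrichment(clone_frequencies(pre), clone_frequencies(post))

# Clones whose share of reads at least doubled (threshold is an assumption).
over_represented = sorted(s for s, fc in enrichment.items() if fc >= 2.0)
```

In this toy data set only the clone labeled chr11:900 exceeds the two-fold threshold. In practice, amplification bias means that read counts are only a rough proxy for clone size.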

Prior to the infusion of gene-modified HSCs, patients may be treated with chemotherapy 
agents or irradiation to help enhance the engraftment efficiency. Gene-modified HSCs are 
re-infused into the patient intravenously. The cells migrate into the bone marrow before 
finally residing in the sinusoids and perivascular tissue [15,16]. Both homing and 
hematopoiesis are integral aspects of engraftment. Cells that have reached the stem cell 
niche through homing will begin producing mature myeloid and lymphoid cells from each 
blood lineage. Hematopoiesis continues through the action of long-term HSCs, which are 
capable of self-renewal for life-long generation of the patient's mature blood cells. 

When HSCs are infused into the patient intravenously, the cells must travel from the 
peripheral blood into the bone marrow, eventually reaching their niche to repopulate the 
blood system. This process, known as homing, is a multistep process that relies on the action 
and interactions of various chemokines, cytokines and other proteins. Examples include 
stromal derived factor 1 (SDF-1) and CXCR4, adhesion molecules such as very late antigen 
4 and 5 (VLA-4/5), lymphocyte function associated antigen 1 (LFA-1), and α4β1 integrin 
interaction with vascular cell adhesion protein 1 (VCAM-1) [12,17-19]. Circulating HSCs 
roll and tether to the blood vessel walls through the action of E- and P-selectins and 
VCAM-1. Tethered HSCs extravasate through the bone marrow endothelium before lodging 
into the bone marrow stem cell niche (Figure 2) [15,20-23]. The entire process is thought to 
occur within a matter of hours following infusion [12]. HSCs with vector provirus insertions 
near genes that enhance homing are more likely to engraft and thus these clones may 
become over-represented during this process. 

After reaching the bone marrow and lodging in the perivascular region, HSCs begin the 
process of repopulating the patient's blood system. During the process of proliferation, some 
of the daughter cells produced by the infused HSCs remain as quiescent HSCs, while others 
self-renew or become committed to either the myeloid or lymphoid system as progenitor 
cells [24]. As gene-modified daughter cells divide, they begin to produce all of the cellular 
subsets of each lineage, with all progeny carrying the transgene of interest. For HSCs that 
harbor proviral integrations near genes involved in stem cell renewal or expansion, 
dysregulation may provide the HSCs with an engraftment advantage. HSCs that have 
vectors integrated near genes that provide a selective advantage during these processes of 
self-renewal or expansion will be more likely to engraft, repopulate, and persist in the 
patient long-term. Examples of such genes include RUNX1 [25-27], globin transcription 
factor 2 (GATA2) [28,29], spleen focus forming virus proviral integration oncogene (Spi-1), 
the transcription factor PU.1 [30,31], as well as homeobox A (HOXA) [32,33]. 

HSC clones that have vector proviral insertions that dysregulate genes involved with 
proliferation or survival have a selective advantage at all stages of engraftment. In order for 
infused cells to engraft and repopulate the patient's blood system they must make it to the 
bone marrow without undergoing apoptosis. Dysregulation of genes that confer a survival 
advantage by inhibiting apoptosis, such as MCL1, could benefit HSCs prior to reaching and 
after lodging in the bone marrow niche [34,35]. Clones with dysregulated genes that provide 
a proliferative advantage to HSCs, such as CCND2, have been over-represented in gene 
therapy studies. 

Retroviral Genotoxicity 

Integrated vector proviruses have the potential to dysregulate the expression of nearby host 
cell genes flanking the integration site [36]. Depending on the integration site of the 
provirus, vector-mediated genotoxicity can lead to gene over-expression, inactivation, or 
production of novel gene transcripts (Figure 3). Transcriptionally active LTR regions with 
strong promoters or enhancers are important in the development of genotoxicity. Integrating 
replication-competent retroviruses are well known for their potential to activate nearby 
genes leading to oncogenesis. However, it was previously believed that replication- 
incompetent viral vectors might not mediate significant genotoxicity. Unfortunately, clinical 
studies have shown that replication-incompetent vectors still cause genotoxicity, in some 
cases leading to clonal expansion and leukemia. 

Identification of Retroviral Integration Sites and Nearby Dysregulated 
Genes 

In order to identify integration sites, genomic DNA is extracted from the bone marrow or the 
peripheral blood of patients that have received gene-modified HSCs. After isolation of the 
DNA, the amplification of provirus LTR-chromosome junctions is commonly conducted 
using ligation-mediated PCR (LM-PCR) or linear-amplification-mediated PCR (LAM-PCR) 
[37]. LM-PCR utilizes frequent cutting restriction enzymes that cut genomic DNA into 
small fragments. Some of these fragments contain an LTR-chromosome junction. Following 
digestion, these fragments are then ligated to linkers and PCR amplified. LAM-PCR 
employs linear amplification of LTR-chromosome junctions followed by double-stranded 
DNA (dsDNA) synthesis. The dsDNA sequences are then digested with restriction enzymes, 
followed by linker ligation to the sequence and nested PCR. Non-restrictive 
linear-amplification-mediated PCR (nrLAM-PCR) has been developed, which avoids restriction 
digest bias of recovered integration sites [38]. 
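
Once junction fragments have been sequenced, the host portion of each read has to be separated from the vector and linker sequence before it can be mapped. A minimal Python sketch of that trimming step follows, assuming each read begins with the known end of the LTR, continues into host DNA, and may end in the linker. The LTR and linker sequences, the minimum flank length of 20 bases, and the function name are all hypothetical placeholders rather than values from any published protocol.

```python
def extract_genomic_flank(read, ltr_end, linker, min_length=20):
    """Return the host sequence between the LTR end and the linker, or None.

    Reads that do not start with the expected LTR end are discarded, as are
    flanks shorter than min_length, which are unlikely to align uniquely.
    """
    read = read.upper()
    ltr_end = ltr_end.upper()
    linker = linker.upper()
    if not read.startswith(ltr_end):
        return None
    flank = read[len(ltr_end):]
    linker_pos = flank.find(linker)
    if linker_pos != -1:
        # Keep only the bases upstream of the linker.
        flank = flank[:linker_pos]
    return flank if len(flank) >= min_length else None


# Hypothetical sequences for illustration only.
LTR_END = "GTCTTTCA"
LINKER = "AGTGGCAC"
host = "ACGTTGCATGCCATGGATCCTTAAGC"

reads = [
    LTR_END + host + LINKER + "NNNN",  # valid junction read
    "TTTTTTTT" + host + LINKER,        # no LTR end, so not a junction
    LTR_END + "ACGT" + LINKER,         # host flank too short to map
]
flanks = [extract_genomic_flank(r, LTR_END, LINKER) for r in reads]
```

Only the first read yields a usable flank here. Exact string matching is used for brevity; real reads contain sequencing errors, so an error-tolerant match would usually be needed.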

Alternative non-PCR methods, such as shuttle vector rescue, also exist [39]. In shuttle vector 
rescue, integrated vector proviruses contain a bacterial origin of replication and a selection 
gene. Peripheral blood DNA from patients is digested with restriction enzymes or randomly 
sheared, ligated, and then transformed into bacteria which are grown as colonies. These 
plasmids contain an LTR-chromosome junction that can be sequenced with an LTR specific 
primer. Shuttle vector rescue avoids PCR-based skewing of obtained integration sites. 

The availability of the human genome sequence, as well as the genomes of other model 
organisms such as mice and macaques has allowed for rapid identification of genes near 
vector proviruses in clinical and preclinical studies. Following sequencing of the LTR- 
chromosome junction, sequence reads can be aligned to the human genome using the 
BLAST-like alignment tool (BLAT) [40]. Genes and oncogenes located close to the vector 
integration site can be identified based on the annotation of the human genome. Thus, 
through the combination of LTR-chromosome junction amplification, next-generation 
sequencing, and bioinformatics, vector proviruses serve as ideal molecular tags to identify 
nearby genes. 
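
After alignment, each integration site is reduced to a chromosome and position, and the nearest annotated gene can be looked up. The Python sketch below shows one simple way to do this, assuming the annotation is available as a per-chromosome list of transcription start sites sorted by position. The gene names and coordinates are hypothetical, and distance to the start site is only one of several reasonable definitions of "nearby".

```python
from bisect import bisect_left


def nearest_gene(position, tss_table):
    """Return (gene, distance) for the start site closest to position.

    tss_table is a non-empty list of (tss_position, gene_name) tuples for a
    single chromosome, sorted by position.
    """
    positions = [p for p, _ in tss_table]
    i = bisect_left(positions, position)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = tss_table[max(i - 1, 0): i + 1]
    tss, gene = min(candidates, key=lambda entry: abs(entry[0] - position))
    return gene, abs(tss - position)


# Hypothetical annotation and integration sites.
annotation = {
    "chr3": [(1_000, "GENE_A"), (50_000, "GENE_B"), (120_000, "GENE_C")],
}
sites = [("chr3", 48_500), ("chr3", 119_000), ("chr3", 200)]

hits = [nearest_gene(pos, annotation[chrom]) for chrom, pos in sites]
```

The binary search keeps each lookup fast even when the annotation holds tens of thousands of genes, which matters once the number of integration sites grows large.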

Over-Represented Gene Classes Near Proviruses in HSC Gene Therapy 
Studies 

Proviral vector integration occurs throughout the genome, but different viral vector types 
have different integration site preferences. HIV based lentiviral vectors favor active genes, 
while murine leukemia virus (MLV) vectors favor transcription start sites [41,42]. 
Gammaretroviruses, such as MLV, preferentially integrate near 
previously identified common integration sites (CISs) in the retroviral tagged cancer gene 
database (RTCGD) [43]. The RTCGD is composed of retroviral integration site data 
acquired from mouse tumors from a variety of different studies and tumor types [44]. The 
RTCGD allows researchers to identify candidate cancer genes dysregulated by proviruses 
that may play a role in human cancer development and progression [13,45]. 

HSC gene therapy trials utilizing MLV and lentiviral vectors have shown that proviral 
insertions are observed in specific classes of genes [46-50]. Both MLV and lentiviral vector 
proviruses are over-represented near genes involved in the establishment and/or 
maintenance of chromatin architecture, signal transduction, and cell cycle [51]. Lentiviral 
vector proviruses were also over-represented near genes involved in chromatin remodeling 
and phosphorylation. Many of the genes identified in retroviral mutagenesis screens are 
linked in gene networks involved in cellular regulatory processes such as apoptosis, signal 
transduction, and transcriptional regulation [51]. The over-representation of vector provirus 
near genes involved in such processes is likely due to the survival and proliferative 
advantages that such mutations could confer to HSCs. For example, in a retrospective study 
of vector integration sites in rhesus macaques that had received autologous MLV transduced 
hematopoietic repopulating cells, the MDS1/EVI1 site was identified as a hot spot of vector 
insertion [52]. It is likely that vector provirus dysregulation of this locus provided the 
infused cells with the potential for increased survival, proliferation, or both. Studies have 
shown that the overexpression of EVI1/MDS1 has the potential to delay or inhibit the 
myeloid differentiation of HSCs, while increasing the proliferation of HSCs and progenitor 
cells [53]. This has also been reported for mouse and monkey HSCs [46,54,55]. Proviral 
integration leading to the dysregulation of the EVI1/MDS1 gene complex can lead to the 
over-expression of either or both genes, inhibiting cellular differentiation. Dysregulation of 
this locus has been shown to be involved in clonal expansion and leukemic development, 
with integration sites likely providing a survival or proliferation advantage to transduced 
HSCs. Extended culture of macaque HSCs revealed an increase in HSC clones with 
integration sites in or near the EVI1/MDS1 locus compared to other infused clones [55]. 
Thus, analysis of vector provirus integration sites can provide evidence for dysregulated 
genes in the absence of adverse events. These data demonstrate the potential of preclinical 
gene therapy studies to identify genes involved in engraftment and hematopoiesis pathways, 
as well as their role in gene networks related to these processes. 

Engraftment Genes Identified in Retroviral Mutagenesis Screens 

Retroviral mutagenesis screens have played an important role in determining genes involved 
in hematopoiesis. Forward retroviral mutagenesis screens in hematopoietic cells have been 
highly successful in identifying genes involved in migration, proliferation, and expansion. 
Identified genes include Rac2, Jak/Stat, and Notch [44,56]. Notch expression is important in 
embryonic development, and throughout life for tissue homeostasis [57]. Dysregulated 
expression of Notch can affect HSC differentiation and lead to skewed differentiation of 
hematopoietic lineages [58]. A study of murine tumor retroviral insertion sites by Suzuki 
and colleagues identified Notch as a CIS [59]. Based on the role of Notch family genes in 
hematopoietic differentiation, dysregulation by proviral insertional mutagenesis has the 
potential to enhance hematopoietic repopulation. 

The role of GATA proteins, especially GATA-1 and GATA-2, is also well established in 
HSC biology. Both are highly expressed in erythroid precursors. As cells differentiate the 
GATA-2 level decreases while GATA-1 expression is maintained at high levels [60]. 
GATA-2 expression is essential for HSC maintenance, survival, and proliferation 
[29,61,62]. Since GATA-2 expression levels are important in HSC proliferation and 
differentiation, dysregulation of GATA-2 expression would be expected to enhance 
engraftment following transplantation of gene-modified cells. GATA2 has in fact been 
identified as a CIS [28]. 

Engraftment Genes Identified in Gene Therapy Studies 

Replication-competent retroviruses cause insertional mutagenesis, leading to their common 
use in mutagenesis screens [13,63]. Although replication-incompetent vectors are capable of 
providing only a single-hit genetic modification via provirus integration, they can still be 
utilized to identify dysregulated genes. Deichmann and colleagues investigated integration 
site data from five clinical gene therapy trials and three pre-clinical trials [64]. This 
retrospective analysis showed that transplanted gene-modified HSCs had very similar 
integration sites and dysregulated genes. The most frequent CISs were insertions that would 
dysregulate genes leading to clonal expansion or leukemic development, such as LMO2 and 
MDS1/EVI1. Thus, the same dysregulated genes are often observed in multiple gene therapy 
studies. 
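
A CIS is generally called when several independent insertions fall within a short genomic window. The Python sketch below illustrates the idea with a simple greedy scan over sorted positions. The window size, the minimum number of insertions, and the coordinates are hypothetical, thresholds differ between studies, and this is not the procedure used in the analyses cited above. Positions are assumed to have been de-duplicated so that each entry represents an independent insertion.

```python
from collections import defaultdict


def find_cis(sites, window=30_000, min_sites=2):
    """Group insertions into clusters of >= min_sites within `window` bp.

    sites is an iterable of (chromosome, position) pairs. Returns a list of
    (chromosome, first_position, last_position, n_insertions) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)

    clusters = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        i = 0
        while i < len(positions):
            # Extend j over every insertion within `window` of positions[i].
            j = i
            while j < len(positions) and positions[j] - positions[i] <= window:
                j += 1
            if j - i >= min_sites:
                clusters.append((chrom, positions[i], positions[j - 1], j - i))
                i = j  # continue after this cluster
            else:
                i += 1
    return clusters


# Hypothetical insertions pooled from several animals or patients.
sites = [
    ("chr3", 100_000), ("chr3", 100_400), ("chr3", 100_664),
    ("chr3", 5_000_000), ("chr7", 20_000),
]
cis = find_cis(sites, window=1_000, min_sites=3)
```

With these inputs a single cluster of three insertions on chr3 is reported. Whether such a cluster is statistically unexpected still has to be judged against the vector's background integration preferences.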

The CGD study by Ott and colleagues revealed that the dysregulation of PRDM16 and 
EVI1/MDS1 caused clonal expansion [5]. The expression of PRDM16 has been shown to be 
involved in HSC maintenance and renewal. Cells lacking expression of PRDM16 exhibit 
increased cell death, so overexpression is expected to lead to an over-representation of clones 
with proviral insertions near PRDM16. PRDM16 may be in a gene network involving 
MDS1/EVI1, GATA2, and other genes that affect HSCs [65,66]. Thus, dysregulation of the 
PRDM16 gene locus could have an effect on the signaling pathways for other genes 
involved in normal hematopoiesis, expanding the effects of dysregulation of the PRDM16 
gene. During the French SCID-X1 study the dysregulation of LMO2 likely led to the 
proliferation of common lymphoid progenitor cells. Over time, dysregulation of LMO2 led 
to the expansion of the lymphoid hematopoietic lineage. LMO2 is expressed only in the 
earliest stages of lymphopoiesis, with the continued expression in mature T-lymphocytes 
leading to the development of lymphoblastic leukemias. In the French SCID-X1 study, the 
dysregulated expression of LMO2 ultimately resulted in lymphoblastic leukemia [67,68]. 

With the large proviral integration site data sets that gene therapy trials can provide, the 
ability to quickly and efficiently analyze the integration site profiles should help to identify 
candidate engraftment genes. One such utility is the QuickMap utility provided by the gene 
therapy safety group (GTSG) [69]. The QuickMap utility relies on cancer gene lists 
provided by the Catalogue of Somatic Mutations in Cancer (COSMIC) [70] as well as the 
RTCGD. The QuickMap utility is able to rapidly analyze sequence data from LTR- 
chromosome junctions to determine the proviral integration site. Once the integration site is 
known, it can identify if the vector provirus is within a gene including known oncogenes, 
within a CpG island, or in a repetitive DNA sequence. Further, the software compares the 
integration site data to a randomly generated data set of one million integrations as a control. 
This database has been utilized previously with ex vivo transduced human cells to explore 
the effect of chemoselection of HSCs on integration site patterns [71]. Within the analyzed 
data, two of the sixteen CISs identified, STAT5B and TNRC6C, were previously identified 
as CISs. 

One limitation of the QuickMap utility is the submission limit of fifty thousand sequences 
per analysis, although this limit can be temporarily increased by contacting the GTSG. As 
many gene therapy studies now use next-generation sequencing, where sequence reads can 
number in the hundreds of thousands to millions, this restriction may limit future use of QuickMap. 
With the increasing number of pre-clinical and clinical HSC gene therapy trials, the 
development of new utilities to efficiently analyze millions of integration site sequence 
reads from next-generation sequencing may aid in the discovery of additional CISs in HSCs. 
These CISs may in turn identify novel engraftment genes. 
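
The comparison against a randomly generated control set described for QuickMap can be illustrated with a small Python sketch. It is not QuickMap's implementation: the single-chromosome toy genome, the gene intervals, the observed positions, and the control size of 10,000 (rather than one million) are all assumptions chosen to keep the example short.

```python
import random


def fraction_in_genes(positions, gene_intervals):
    """Fraction of positions falling inside any inclusive (start, end) interval."""
    inside = sum(
        any(start <= p <= end for start, end in gene_intervals)
        for p in positions
    )
    return inside / len(positions)


def random_control(n, genome_length, seed=0):
    """Draw n uniformly distributed positions as a background control."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.randrange(genome_length) for _ in range(n)]


# Hypothetical toy genome in which genes cover about 10% of the sequence.
GENOME_LENGTH = 1_000_000
GENES = [(100_000, 150_000), (400_000, 420_000), (700_000, 730_000)]

# Hypothetical observed integration sites; six of the eight lie within genes.
observed = [101_000, 120_500, 410_000, 415_250, 705_000, 729_999, 50_000, 900_000]
control = random_control(10_000, GENOME_LENGTH)

observed_fraction = fraction_in_genes(observed, GENES)
control_fraction = fraction_in_genes(control, GENES)
enrichment = observed_fraction / control_fraction
```

An enrichment well above one suggests that the observed sites favor genes more than chance would predict. The same pattern applies to other features such as CpG islands or repetitive sequence, and a uniform control ignores mappability, so a real analysis would need a more careful background model.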

A Preclinical Gene Therapy Study Identifies Candidate Engraftment Genes 

A study by Kiem and colleagues of three baboons that received baboon hematopoietic 
repopulating cells exposed to a gammaretroviral vector revealed a CIS of 664 base pairs in a 
CpG island that existed between zinc finger protein 91 (ZFP91) and leupaxin (LPXN) [72]. 
It was hypothesized that the CIS between ZFP91 and LPXN led to the dysregulation of one 
or both genes, providing the HSCs with an engraftment advantage. Thus these two genes 
may play a role in normal engraftment pathways. This study suggests that other HSC gene 
therapy trials may identify CISs near genes, including microRNAs, not previously 
associated with engraftment. Previous studies regarding the roles of genes involved in the 
engraftment process have been utilized to improve HSC transplantation [73]. These 
identified genes could serve as targets for novel small molecule drugs to increase the gene 
expression of the identified targets prior to HSC infusion. These drugs could be of benefit to 
patients receiving any type of HSC transplantation, and may be of significant value in the field 
of cord blood transplantation where low cell numbers and low engraftment limit clinical use 
[74]. 

Use of Retroviral Vectors to Identify Genes Involved in Other Biological 
Processes 

The ability of retroviral vectors to dysregulate genes can be exploited to better understand 
many other biological processes. If a library of cells mutagenized with retroviral vectors is 
placed under any selective pressure, those clones with integrants near genes that provide a 
selective advantage will be enriched. For example, it should be possible to analyze over- 
represented genes in specific lineages of hematopoietic repopulating cells. If a set of genes is 
overrepresented near vector proviruses in myeloid but not lymphoid repopulating cells those 
genes are candidates for affecting myeloid differentiation and expansion. There are many 
possible uses of this technology. Replication-incompetent vectors have been used to identify 
genes involved in liver cancer [75] and we are using this approach to study the development 
of acute myeloid leukemia (GDT unpublished data). Further, retroviral integration sites 
could provide insight into genes that play a role in the metastasis of solid tumor cells to the 
bone marrow, such as in prostate cancer [76]. Analysis of gene expression in cancer cells 
that have metastasized to the bone marrow could provide insight into genes that helped them 
engraft in the bone marrow. Identification of genes that assist in homing and engraftment 
would be potential molecular targets to reduce the likelihood of metastasis to the bone 
marrow. Thus, the data obtained from mutagenesis screens using replication-incompetent 
vectors should provide useful information about the physiological role of genes and their 
interactions in gene networks for many biological processes including cancer. 

Summary 

Pre-clinical and clinical trials utilizing HSCs with retroviral vectors have yielded important 
information regarding the effects of retroviral insertional mutagenesis on host genes. 
Through the use of annotated genomes for humans and model organisms, retroviral insertion 
sites in gene therapy trials can be mapped to the genome to determine nearby potentially 
dysregulated genes. Advances in bioinformatics, and the creation of cancer gene databases, 
such as the RTCGD, have been instrumental in identifying CISs and thus dysregulated 
genes. 

As the number of HSC gene therapy trials increases, more data regarding the role of genes in 
biological processes will be obtained. The data from these studies can be mined to identify 
genes that provide a competitive engraftment advantage to infused HSCs. Studies without 
observed abnormal hematopoiesis following engraftment still have the potential to identify 
genes that have an effect on hematopoiesis and engraftment. Novel engraftment genes might 
be targeted with small molecule drugs to increase the engraftment efficiency of infused 
HSCs. Therefore, HSC gene therapy trials carry the potential to improve HSC 
transplantation by providing data that identifies genes and gene networks involved in 
engraftment and hematopoietic pathways. 

[Figure 1 panel labels: population of gene-modified HSCs. Post-infusion: engrafted clones with proviruses near genes involved in engraftment are now over-represented.]

Figure 1. Selective pressure for HSCs to engraft enriches for clones with proviral integration 
sites that confer an engraftment advantage 

After harvesting patient HSCs the cells are transduced with retroviral vectors, leading to a 
polyclonal population of cells with numerous different proviral insertion sites. Following 
infusion of the cells into the patient, cells with insertions near genes that confer a 
competitive engraftment advantage (red, purple clones) will become enriched. Provirus 
vector integration sites in the purple and red cells are thus over-represented. 

Figure 2. Infused HSCs must home to their bone marrow niches before they can begin the 
process of hematopoiesis 

After infusion of HSCs into the peripheral blood, shown as purple circles, HSCs begin the 
process of homing to the marrow. E- and P-selectins and VCAM-1 on the vessel walls tether 
circulating HSCs and allow for rolling on the vessel wall to occur. This is followed by 
extravasation of the HSCs through the extracellular matrix into the bone marrow. SDF-1 
released from osteoblasts and epithelial tissues in the bone marrow binds to CXCR4 receptors 
on the HSCs, an important step in this process. After reaching the bone marrow, HSCs then migrate to the 
perivascular regions and begin the process of hematopoiesis. 

Figure 3. Mechanisms of insertional mutagenesis 

(1) 3' proviral LTRs can drive over-expression of nearby genes. (2) Enhancers in the LTRs 
can activate nearby promoters leading to increased transcription. (3) Proviral insertion 
within a host gene and transcription from the 5' LTR can lead to the creation of novel gene 
transcripts. (4) Premature polyadenylation of host cell gene transcripts can be caused by 
proviral insertion within a gene. Black boxes represent the host gene promoter and grey 
squares represent the exons. Grey boxes containing white rectangles represent proviral LTRs 
and striped rectangles are used to show proviral transgenes. 